---
description: The Algolia way to build a site search for any website using a crawler.
title: Algolia site search
createdAt: 2019-03-09
updatedAt: 2019-03-18
contributors:
  - dzello
---

# Introduction

Need a site search? If you have content that you want users to find, the answer is yes! Thankfully, it's not hard to add a basic site search using Algolia, even if your content doesn't live in a well-organized CMS or database. This awesome stack collects the tools you'll need to crawl and scrape your site's content, upload it to Algolia, and add a search box to your UI.

# Search Engine

<Tools>
  <StackShare name="Algolia">
    Algolia's Community plan is free up to 10,000 records with unlimited search queries.
  </StackShare>
  <GitHub name="DeuxHuitHuit/algolia-webcrawler">
    Crawls the sitemap and uses CSS selectors to extract elements from the page, then pushes the data to a configured Algolia index. Written in Node.js. 
  </GitHub>
</Tools>

## Resources

- [Algolia Docs - How Algolia Works](https://www.algolia.com/doc/guides/getting-started/how-algolia-works/)
- [Algolia Docs - Prepare Your Data - Format and Structure](https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/)
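
As a rough illustration of what preparing your data can look like for a crawled page, the sketch below turns one page into a flat JSON record. `objectID` is the attribute Algolia uses to identify a record; the other attribute names (`url`, `title`, `content`) and the `build_record` helper are assumptions made for this example, not a required schema. Uploading the record with an Algolia SDK is left out.

```python
import hashlib
import json


def build_record(url: str, title: str, content: str) -> dict:
    """Build one flat, JSON-serializable record for a crawled page.

    The attribute names other than objectID are hypothetical.
    """
    # Derive a stable objectID from the URL so that re-crawling the same
    # page replaces the existing record instead of creating a duplicate.
    object_id = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return {
        "objectID": object_id,
        "url": url,
        "title": title.strip(),
        # Collapse runs of whitespace left over from the HTML source.
        "content": " ".join(content.split()),
    }


record = build_record(
    "https://example.com/docs/getting-started",
    "  Getting started ",
    "Install the package,\n  then run the setup command.",
)
print(json.dumps(record, indent=2))
```

Hashing the URL is only one way to get a stable identifier; any value that is unique per record and stays the same between crawls would work.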

# Crawling

<Tools>
  <GitHub name="DeuxHuitHuit/algolia-webcrawler">
    Crawls the sitemap and uses CSS selectors to extract elements from the page, then pushes the data to a configured Algolia index. Written in Node.js. 
  </GitHub>
  <GitHub name="algolia/docsearch">
    An open-source project from Algolia that crawls a website and uploads its content to an index. Used mostly for documentation, but it can be adapted to general websites.
  </GitHub>
  <GitHub name="scrapy/scrapy">
    A popular crawling and scraping tool for Python. This will help you get the data off your site, then you'll use an Algolia SDK to upload it.
  </GitHub>
  <GitHub name="matthewmueller/x-ray">
    Like Scrapy but for Node.js. The syntax is composable. It takes a bit to get used to but is very powerful.
  </GitHub>
</Tools>
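
To give a feel for the extraction step these tools perform, here is a minimal sketch that pulls a page's title and paragraph text out of raw HTML. It is a simplification: the tools above use CSS selectors or their own APIs, while this example matches plain tag names with Python's standard `html.parser` module. The `PageExtractor` class and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> text and the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current_tag = None  # "title" or "p" while inside one
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current_tag = tag
            self._buffer = []

    def handle_data(self, data):
        # Text inside inline children such as <b> or <a> is kept too,
        # because only <title> and <p> change the current tag.
        if self._current_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current_tag:
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            elif text:
                self.paragraphs.append(text)
            self._current_tag = None


html = """
<html><head><title>Pricing | Example</title></head>
<body><p>Plans start at <b>$5</b> per month.</p><p>Cancel any time.</p></body></html>
"""
extractor = PageExtractor()
extractor.feed(html)
print(extractor.title)
print(extractor.paragraphs)
```

For a real site, a dedicated crawler from the list above will handle sitemaps, link following, and selector-based extraction far more robustly than this sketch.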

## Resources

- [The Ultimate Guide to Web Scraping with Node.js](https://medium.freecodecamp.org/the-ultimate-guide-to-web-scraping-with-node-js-daa2027dcd3)
- [How To Crawl A Web Page with Scrapy and Python 3](https://www.digitalocean.com/community/tutorials/how-to-crawl-a-web-page-with-scrapy-and-python-3)
- [Algolia Docs - Indexing Long Documents](https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/how-to/indexing-long-documents/)
- [chunk-text - chunk/split a string by length without cutting/truncating words](https://github.com/algolia/chunk-text)
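
The idea behind the last two resources, splitting a long page into several smaller records without cutting words, can be sketched in a few lines. This is not the `chunk-text` library's implementation, only an illustration of the technique; the `chunk_text` and `page_to_records` helpers, the `position` attribute, and the `url#index` scheme for `objectID` are assumptions made for this example.

```python
from typing import Dict, List


def chunk_text(text: str, max_length: int) -> List[str]:
    """Split text on whitespace into chunks of at most max_length characters.

    A single word longer than max_length is kept whole rather than cut.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_length or not current:
            current = candidate
        else:
            # The next word does not fit: close this chunk, start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def page_to_records(url: str, title: str, body: str,
                    max_length: int = 500) -> List[Dict[str, object]]:
    """Turn one long page into one record per chunk (hypothetical schema)."""
    return [
        {
            "objectID": f"{url}#{index}",
            "url": url,
            "title": title,
            "content": chunk,
            "position": index,
        }
        for index, chunk in enumerate(chunk_text(body, max_length))
    ]


print(chunk_text("the quick brown fox jumps", 10))
```

Because every chunk carries the page's `url` and `title`, each record can be displayed on its own in search results. The Algolia guide linked above discusses how to avoid showing several chunks of the same page as separate hits.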

# Static site integrations

If you're using a common static site generator to build your site, there’s a good chance Algolia or the community has created a pre-built integration you can use.

<Tools>
  <GitHub name="algolia/jekyll-algolia">
    An official, Algolia-supported way to add search to websites built with Jekyll.
  </GitHub>
  <GitHub name="algolia/gatsby-plugin-algolia">
    A Gatsby plugin that lets you query for GraphQL site content and convert it into Algolia records.
  </GitHub>
  <GitHub name="replicatedhq/hugo-algolia">
    For Hugo users, an alternative to the DocSearch plugin that allows more customization.
  </GitHub>
</Tools>

## Resources

- [Static site search with Algolia and Hugo](https://forestry.io/blog/search-with-algolia-in-hugo/)
- [Jekyll search with Algolia and webtasks](https://forestry.io/blog/search-with-algolia-in-jekyll/)
- [Custom search with Algolia in Gatsby](https://janosh.io/blog/gatsby-algolia-search)
- [Gatsby Docs - Adding Search](https://www.gatsbyjs.org/docs/adding-search/)

# User interface

<Tools>
  <GitHub name="algolia/instantsearch.js">
    This is the "vanilla JS" version of InstantSearch, meaning that it doesn't need any JavaScript framework to work.
  </GitHub>
  <GitHub name="algolia/react-instantsearch">
    If you're already using React, maybe through Gatsby, this library will feel very comfortable.
  </GitHub>
  <GitHub name="algolia/vue-instantsearch">
    The Vue.js version of InstantSearch.
  </GitHub>
  <GitHub name="algolia/angular-instantsearch">
    The Angular version of InstantSearch, compatible with Angular 5 and above.
  </GitHub>
  <GitHub name="algolia/autocomplete.js">
    Use this for a standard drop-down interface. It's a good fit if you’re putting a search bar in the site header.
  </GitHub>
</Tools>

## Resources

- [Algolia Docs - What is InstantSearch.js](https://www.algolia.com/doc/guides/building-search-ui/what-is-instantsearch/js/)
- [Video - Build an instant search result page](https://www.youtube.com/watch?v=lN0-mnwyfrE)
- [Search interface - 20 things to consider](https://uxplanet.org/search-interface-20-things-to-consider-4b1466e98881)

# Utilities

<Tools>
  <GitHub name="stedolan/jq">
    A Swiss Army knife for manipulating JSON on the command line. Useful for transforming records before uploading them to Algolia.
  </GitHub>
  <GitHub name="Shipow/searchbox">
    Generate the code for a search box that matches your site’s design and colors.
  </GitHub>
  <StackShare name="yarn"></StackShare>
</Tools>